237 research outputs found
Query-Driven Sampling for Collective Entity Resolution
Probabilistic databases play a preeminent role in the processing and
management of uncertain data. Recently, many database research efforts have
integrated probabilistic models into databases to support tasks such as
information extraction and labeling. Many of these efforts are based on
batch-oriented inference, which inhibits a real-time workflow. One important task is
entity resolution (ER). ER is the process of determining records (mentions) in
a database that correspond to the same real-world entity. Traditional pairwise
ER methods can lead to inconsistencies and low accuracy due to localized
decisions. Leading ER systems solve this problem by collectively resolving all
records using a probabilistic graphical model and Markov chain Monte Carlo
(MCMC) inference. However, for large datasets this is an extremely expensive
process. One key observation is that such an exhaustive ER process incurs a huge
up-front cost, which is wasteful in practice because most users are interested
in only a small subset of entities. In this paper, we advocate pay-as-you-go
entity resolution by developing a number of query-driven collective ER
techniques. We introduce two classes of SQL queries that involve ER operators:
selection-driven ER and join-driven ER. We implement novel variations of
the Metropolis-Hastings MCMC algorithm to generate biased samples and
selectivity-based scheduling algorithms to support the two classes of ER
queries. Finally, we show that query-driven ER algorithms can converge and
return results within minutes over a database populated with the extraction
from a newswire dataset containing 71 million mentions.
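To make the idea of query-biased sampling concrete, here is a minimal sketch, not the paper's algorithm. It assumes a toy pairwise string-similarity model (Python's `difflib`), a hypothetical function name `query_driven_er`, and illustrative `scale` and `threshold` parameters. Mentions that resemble the query are proposed for a move more often, but the selection weights do not depend on the current clustering, so the proposal stays symmetric and the plain Metropolis acceptance rule applies.

```python
import math
import random
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (toy stand-in for a real model)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def query_driven_er(mentions, query, steps=20000, burn_in=2000,
                    scale=10.0, threshold=0.6, seed=0):
    """Estimate how often each mention shares a cluster with the query's best match.

    Assumed model: log-probability of a clustering = sum of pairwise affinities
    over all pairs of mentions that sit in the same cluster.
    """
    rng = random.Random(seed)
    n = len(mentions)
    affinity = [[scale * (similarity(a, b) - threshold) for b in mentions]
                for a in mentions]
    relevance = [similarity(m, query) for m in mentions]
    # Query-driven bias: relevant mentions are picked more often. The weights are
    # fixed (state-independent), which keeps the proposal symmetric.
    weights = [0.05 + r for r in relevance]
    anchor = max(range(n), key=relevance.__getitem__)

    labels = list(range(n))  # start with every mention in its own cluster
    together = [0] * n
    kept = 0

    def cohesion(i, label):
        # Total affinity between mention i and the other members of `label`.
        return sum(affinity[i][j] for j in range(n)
                   if j != i and labels[j] == label)

    for step in range(steps):
        i = rng.choices(range(n), weights=weights)[0]
        new_label = rng.randrange(n)  # uniform over a fixed label set
        if new_label != labels[i]:
            delta = cohesion(i, new_label) - cohesion(i, labels[i])
            # Metropolis acceptance: always accept improvements, otherwise
            # accept with probability exp(delta).
            if delta >= 0 or rng.random() < math.exp(delta):
                labels[i] = new_label
        if step >= burn_in:
            kept += 1
            for j in range(n):
                if labels[j] == labels[anchor]:
                    together[j] += 1
    return anchor, [count / kept for count in together]


mentions = ["Barack Obama", "barack obama", "Angela Merkel"]
anchor, probs = query_driven_er(mentions, query="obama")
```

Here `probs[j]` is the sampled frequency with which mention `j` is co-clustered with the mention that best matches the query. A real system would replace the string similarity with a learned factor graph and would not recompute cluster cohesion from scratch at every step.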
Infection and venous thromboembolism in patients undergoing colorectal surgery: what is the relationship?
BACKGROUND: There is evidence demonstrating an association between infection and venous thromboembolism. We recently identified this association in the postoperative setting; however, the temporal relationship between infection and venous thromboembolism is not well defined.
OBJECTIVE: We sought to determine the temporal relationship between venous thromboembolism and postoperative infectious complications in patients undergoing colorectal surgery.
DESIGN, SETTING, AND PATIENTS: A retrospective cohort analysis was performed using data for patients undergoing colorectal surgery in the National Surgical Quality Improvement Program 2010 database.
MAIN OUTCOME MEASURES: The primary outcome measures were the rate and timing of venous thromboembolism and postoperative infection among patients undergoing colorectal surgery during 30 postoperative days.
RESULTS: Of 39,831 patients who underwent colorectal surgery, the overall rate of venous thromboembolism was 2.4% (n = 948); 729 (1.8%) patients were diagnosed with deep vein thrombosis, and 307 (0.77%) patients were diagnosed with pulmonary embolism. Eighty-eight (0.22%) patients were reported as developing both deep vein thrombosis and pulmonary embolism. Following colorectal surgery, the development of a urinary tract infection, pneumonia, organ space surgical site infection, or deep surgical site infection was associated with a significantly increased risk for venous thromboembolism. The majority (52%-85%) of venous thromboembolisms in this population occurred the same day or a median of 3.5 to 8 days following the diagnosis of infection. The approximate relative risk for developing any venous thromboembolism increased each day following the development of each type of infection (range, 0.40%-1.0%) in comparison with patients not developing an infection.
LIMITATIONS: We are unable to account for differences in data collection, prophylaxis, and venous thromboembolism surveillance between hospitals in the database. Additionally, there is limited patient follow-up.
CONCLUSIONS: These findings of a temporal association between infection and venous thromboembolism suggest a potential early indicator for using certain postoperative infectious complications as clinical warning signs that a patient is more likely to develop venous thromboembolism. Further studies into best practices for prevention are warranted.
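The counts reported in the RESULTS can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative. Patients with both deep vein thrombosis and pulmonary embolism are counted once, so the overall total follows from inclusion-exclusion.

```python
# Counts as reported in the abstract's RESULTS section.
total_patients = 39831
dvt = 729    # deep vein thrombosis
pe = 307     # pulmonary embolism
both = 88    # patients with both DVT and PE

# Inclusion-exclusion: patients with any venous thromboembolism.
any_vte = dvt + pe - both


def pct(count):
    """Percentage of the full cohort."""
    return 100 * count / total_patients


rates = {
    "any VTE": round(pct(any_vte), 1),
    "DVT": round(pct(dvt), 1),
    "PE": round(pct(pe), 2),
    "both": round(pct(both), 2),
}
print(any_vte, rates)
```

The derived total matches the reported n = 948, and the rounded percentages agree with the reported 2.4%, 1.8%, 0.77%, and 0.22%.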
Influence of human impact and bedrock differences on the vegetational history of the Insubrian Southern Alps
Vegetation history for the study region is reconstructed on the basis of pollen, charcoal and AMS 14C investigations of lake sediments from Lago del Segrino (calcareous bedrock) and Lago di Muzzano (siliceous bedrock). Late-glacial forests were characterised by Betula and Pinus sylvestris. At the beginning of the Holocene they were replaced by temperate continental forest and shrub communities. A special type of temperate lowland forest, with Abies alba as the most important tree, was present in the period 8300 to 4500 B.P. Subsequently, Fagus, Quercus and Alnus glutinosa were the main forest components and A. alba ceased to be of importance. Castanea sativa and Juglans regia were probably introduced after forest clearance by fire during the first century A.D. On soils derived from siliceous bedrock, C. sativa was already dominant at ca. A.D. 200 (A.D. dates are in calendar years). In limestone areas, however, C. sativa failed to achieve a dominant role. After the introduction of C. sativa, the main trees were initially oak (Quercus spp.) and later the walnut (Juglans regia). Ostrya carpinifolia became the dominant tree around Lago del Segrino only in the last 100–200 years though it had spread into the area at ca. 5000 cal. B.C. This recent expansion of Ostrya is confirmed at other sites and appears to be controlled by human disturbances involving especially clearance. It is argued that these forests should not be regarded as climax communities. It is suggested that under undisturbed succession they would develop into mixed deciduous forests consisting of Fraxinus excelsior, Tilia, Ulmus, Quercus and Acer.
Composition of Haar Paraproducts: The Random Case
When is the composition of paraproducts bounded? This is an important and
difficult question, related to a question of Sarason on composition of
Hankel matrices, and the two-weight problem for the Hilbert transform. We
consider randomized variants of this question, finding non-classical
characterizations for dyadic paraproducts.
Comment: 13 pages. Submitted. v2: \showkeys commented out, with other minor
changes.
Symmetries and Asymmetries of B -> K* mu+ mu- Decays in the Standard Model and Beyond
The rare decay B -> K* (-> K pi) mu+ mu- is regarded as one of the crucial
channels for B physics as the polarization of the K* allows a precise angular
reconstruction resulting in many observables that offer new important tests of
the Standard Model and its extensions. These angular observables can be
expressed in terms of CP-conserving and CP-violating quantities which we study
in terms of the full form factors calculated from QCD sum rules on the
light-cone, including QCD factorization corrections. We investigate all
observables in the context of the Standard Model and various New Physics
models, in particular the Littlest Higgs model with T-parity and various MSSM
scenarios, identifying those observables with small to moderate dependence on
hadronic quantities and large impact of New Physics. One important result of
our studies is that new CP-violating phases will produce clean signals in
CP-violating asymmetries. We also identify a number of correlations between
various observables which will allow a clear distinction between different New
Physics scenarios.
Comment: 56 pages, 18 figures, 14 tables. v5: Missing factor in eqs. (3.31-32)
and fig. 6 corrected. Minor misprints in eq. (2.10) and table A corrected.
Conclusions unchanged.
Solvation free energy profile of the SCN- ion across the water-1,2-dichloroethane liquid/liquid interface. A computer simulation study
The solvation free energy profile of a single SCN- ion is calculated across the water-1,2-dichloroethane liquid/liquid interface at 298 K by the constraint force method. The obtained results show that the free energy cost of transferring the ion from the aqueous to the organic phase is about 70 kJ/mol. The free energy profile shows a small but clear well at the aqueous side of the interface, in the subsurface region of the water phase, indicating the ability of the SCN- ion to be adsorbed in the close vicinity of the interface. Upon entrance of the SCN- ion to the organic phase, a coextraction of the water molecules of its first hydration shell occurs. Accordingly, when it is located at the boundary of the two phases, the SCN- ion prefers orientations in which its bulky S atom is located at the aqueous side, and the small N atom, together with its first hydration shell, at the organic side of the interface.
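In constraint-force calculations of this kind, the free energy profile is typically obtained by integrating the mean force on the constrained ion along the interface normal. The sketch below illustrates only that integration step, on synthetic data. The sigmoid-shaped profile, its width, the grid, and the sign convention (force taken as minus the free energy gradient) are assumptions for illustration and are not taken from the study. The 70 kJ/mol step height is borrowed from the abstract purely to set the scale, and the toy profile does not model the subsurface well.

```python
import math


def integrate_mean_force(z, mean_force):
    """Cumulative trapezoid rule: A(z_k) - A(z_0) = -integral of <F> dz."""
    profile = [0.0]
    for k in range(1, len(z)):
        dz = z[k] - z[k - 1]
        profile.append(profile[-1] - 0.5 * (mean_force[k] + mean_force[k - 1]) * dz)
    return profile


def synthetic_force(z, height=70.0, width=0.3):
    """Mean force (kJ/mol/nm) for an assumed sigmoid profile A(z) = height * s(z)."""
    s = 1.0 / (1.0 + math.exp(-z / width))
    return -height * s * (1.0 - s) / width  # F = -dA/dz


# Positions along the interface normal in nm (aqueous side negative, by assumption).
z_grid = [-3.0 + 0.05 * k for k in range(121)]
forces = [synthetic_force(z) for z in z_grid]

profile = integrate_mean_force(z_grid, forces)
transfer_cost = profile[-1]  # free energy change from z = -3 nm to z = +3 nm
print(round(transfer_cost, 2))
```

With real simulation output, `forces` would hold the time-averaged force at each constrained position, and the statistical error of each average would propagate into the integrated profile.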
Analysis of Endocrine Disruption in Southern California Coastal Fish Using an Aquatic Multispecies Microarray
Background: Endocrine disruptors include plasticizers, pesticides, detergents, and pharmaceuticals. Turbot and other flatfish are used to characterize the presence of chemicals in the marine environment. Unfortunately, there are relatively few genes of turbot and other flatfish in GenBank, which limits the use of molecular tools such as microarrays and quantitative reverse-transcriptase polymerase chain reaction (qRT-PCR) to study disruption of endocrine responses in sentinel fish captured by regulatory agencies.
Objectives: We fabricated a multigene cross-species microarray as a diagnostic tool to screen the effects of environmental chemicals in fish, for which there is minimal genomic information. The array included genes that are involved in the actions of adrenal and sex steroids, thyroid hormone, and xenobiotic responses. This microarray will provide a sensitive tool for screening for the presence of chemicals with adverse effects on endocrine responses in coastal fish species.
Methods: We used a custom multispecies microarray to study gene expression in wild hornyhead turbot (Pleuronichthys verticalis) collected from polluted and clean coastal waters and in laboratory male zebrafish (Danio rerio) after exposure to estradiol and 4-nonylphenol. We measured gene-specific expression in turbot liver by qRT-PCR and correlated it to microarray data.
Results: Microarray and qRT-PCR analyses of livers from turbot collected from polluted areas revealed altered gene expression profiles compared with those from nonaffected areas.
Conclusions: The agreement between the array data and qRT-PCR analyses validates this multispecies microarray. The microarray measurement of gene expression in zebrafish, which are phylogenetically distant from turbot, indicates that this multispecies microarray will be useful for measuring endocrine responses in other fish.